{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TF-TRT Inference from Saved Model with TensorFlow 2\n",
    "\n",
    "In this notebook, we demonstrate the process to create a TF-TRT optimized model from a Tensorflow *saved model*.\n",
    "\n",
    "This notebook was designed to run with TensorFlow versions 2.x which is included as part of NVIDIA NGC Tensorflow containers from version `nvcr.io/nvidia/tensorflow:19.12-tf2-py3`, that can be downloaded from the [NGC website](https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow).\n",
    " \n",
    "\n",
    "## Notebook  Content\n",
    "1. [Pre-requisite: data and model](#1)\n",
    "1. [Verifying the orignal FP32 model](#2)\n",
    "1. [Creating TF-TRT FP32 model](#3)\n",
    "1. [Creating TF-TRT FP16 model](#4)\n",
    "1. [Creating TF-TRT INT8 model](#5)\n",
    "1. [Calibrating TF-TRT INT8 model with raw JPEG images](#6)\n",
    " \n",
    "## Quick start\n",
    "We will run this demonstration with a saved Resnet-v1-50 model, to be downloaded and stored at `/path/to/saved_model`.\n",
    "\n",
    "The INT8 calibration process requires access to a small but representative sample of real training or valiation data.\n",
    "\n",
    "We will use the ImageNet dataset that is stored in TFrecords format. Google provide an excellent all-in-one script for downloading and preparing the ImageNet dataset at \n",
    "\n",
    "https://github.com/tensorflow/models/blob/master/research/inception/inception/data/download_and_preprocess_imagenet.sh.\n",
    "\n",
    "\n",
    "To run this notebook, start the NGC TF container, providing correct path to the ImageNet validation data `/path/to/image_net` and the folder `/path/to/saved_model` containing the TF saved model:\n",
    "\n",
    "```bash\n",
    "nvidia-docker run --rm -it -p 8888:8888 -v /path/to/image_net:/data  -v /path/to/saved_model:/saved_model --name TFTRT nvcr.io/nvidia/tensorflow:19.12-tf2-py3\n",
    "```\n",
    "\n",
    "Within the container, we then start Jupyter notebook with:\n",
    "\n",
    "```bash\n",
    "jupyter notebook --ip 0.0.0.0 --port 8888  --allow-root\n",
    "```\n",
    "\n",
    "Connect to Jupyter notebook web interface on your host http://localhost:8888.\n",
    "\n",
    "\n",
    "<a id=\"1\"></a>\n",
    "## 1. Pre-requisite: data and model\n",
    "\n",
    "We first install some extra packages and external dependencies needed for, e.g. preprocessing ImageNet data. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "pushd /workspace/nvidia-examples/tensorrt/tftrt/examples/object_detection/ \n",
    "bash install_dependencies.sh;\n",
    "popd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ['CUDA_VISIBLE_DEVICES']='0'\n",
    "\n",
    "import time\n",
    "import logging\n",
    "import numpy as np\n",
    "\n",
    "import tensorflow as tf\n",
    "print(\"TensorFlow version: \", tf.__version__)\n",
    "\n",
    "from tensorflow.python.compiler.tensorrt import trt_convert as trt\n",
    "from tensorflow.python.saved_model import tag_constants\n",
    "\n",
    "logging.getLogger(\"tensorflow\").setLevel(logging.ERROR)\n",
    "\n",
    "# check TensorRT version\n",
    "print(\"TensorRT version: \")\n",
    "!dpkg -l | grep nvinfer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data\n",
    "We verify that the correct ImageNet data folder has been mounted and validation data files of the form `validation-00xxx-of-00128` are available."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_files(data_dir, filename_pattern):\n",
    "    if data_dir == None:\n",
    "        return []\n",
    "    files = tf.io.gfile.glob(os.path.join(data_dir, filename_pattern))\n",
    "    if files == []:\n",
    "        raise ValueError('Can not find any files in {} with '\n",
    "                         'pattern \"{}\"'.format(data_dir, filename_pattern))\n",
    "    return files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "VALIDATION_DATA_DIR = \"/data\"\n",
    "validation_files = get_files(VALIDATION_DATA_DIR, 'validation*')\n",
    "print('There are %d validation files. \\n%s\\n%s\\n...'%(len(validation_files), validation_files[0], validation_files[-1]))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TF saved model\n",
    "If not already downloaded, we will be downloading and working with a ResNet-50 v1 checkpoint from https://github.com/tensorflow/models/tree/master/official/resnet"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "FILE=/saved_model/resnet_v1_50_2016_08_28.tar.gz\n",
    "if [ -f $FILE ]; then\n",
    "   echo \"The file '$FILE' exists.\"\n",
    "else\n",
    "   echo \"The file '$FILE' in not found. Downloading...\"\n",
    "   wget -P /saved_model/ http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v1_fp32_savedmodel_NHWC.tar.gz\n",
    "fi\n",
    "\n",
    "tar -xzvf /saved_model/resnet_v1_fp32_savedmodel_NHWC.tar.gz -C /saved_model "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Helper functions\n",
    "We define a few helper functions to read and preprocess Imagenet data from TFRecord files. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def deserialize_image_record(record):\n",
    "    feature_map = {\n",
    "        'image/encoded':          tf.io.FixedLenFeature([ ], tf.string, ''),\n",
    "        'image/class/label':      tf.io.FixedLenFeature([1], tf.int64,  -1),\n",
    "        'image/class/text':       tf.io.FixedLenFeature([ ], tf.string, ''),\n",
    "        'image/object/bbox/xmin': tf.io.VarLenFeature(dtype=tf.float32),\n",
    "        'image/object/bbox/ymin': tf.io.VarLenFeature(dtype=tf.float32),\n",
    "        'image/object/bbox/xmax': tf.io.VarLenFeature(dtype=tf.float32),\n",
    "        'image/object/bbox/ymax': tf.io.VarLenFeature(dtype=tf.float32)\n",
    "    }\n",
    "    with tf.name_scope('deserialize_image_record'):\n",
    "        obj = tf.io.parse_single_example(record, feature_map)\n",
    "        imgdata = obj['image/encoded']\n",
    "        label   = tf.cast(obj['image/class/label'], tf.int32)\n",
    "        bbox    = tf.stack([obj['image/object/bbox/%s'%x].values\n",
    "                            for x in ['ymin', 'xmin', 'ymax', 'xmax']])\n",
    "        bbox = tf.transpose(tf.expand_dims(bbox, 0), [0,2,1])\n",
    "        text    = obj['image/class/text']\n",
    "        return imgdata, label, bbox, text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from preprocessing import vgg_preprocess as vgg_preprocessing\n",
    "def preprocess(record):\n",
    "        # Parse TFRecord\n",
    "        imgdata, label, bbox, text = deserialize_image_record(record)\n",
    "        #label -= 1 # Change to 0-based if not using background class\n",
    "        try:    image = tf.image.decode_jpeg(imgdata, channels=3, fancy_upscaling=False, dct_method='INTEGER_FAST')\n",
    "        except: image = tf.image.decode_png(imgdata, channels=3)\n",
    "\n",
    "        image = vgg_preprocessing(image, 224, 224)\n",
    "        return image, label"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Define some global variables\n",
    "BATCH_SIZE = 64\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"2\"></a>\n",
    "## 2. Verifying the orignal FP32 model\n",
    "We demonstrate the conversion process with a Resnet-50 v1 model. First, we inspect the original Tensorflow model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "SAVED_MODEL_DIR =  \"/saved_model/resnet_v1_fp32_savedmodel_NHWC/1538686669/\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We employ `saved_model_cli` to inspect the inputs and outputs of the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!saved_model_cli show --all --dir $SAVED_MODEL_DIR"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This give us information on the input and output tensors as `input_tensor:0` and `softmax_tensor:0` respectively. Also note that the number of output classes here is 1001 instead of 1000 Imagenet classes. This is because the network was trained with an extra background class. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "INPUT_TENSOR = 'input_tensor:0'\n",
    "OUTPUT_TENSOR = 'softmax_tensor:0'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we define a function to read in a saved mode, measuring its speed and accuracy on the validation data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def benchmark_saved_model(SAVED_MODEL_DIR, BATCH_SIZE=64):\n",
    "    # load saved model\n",
    "    saved_model_loaded = tf.saved_model.load(SAVED_MODEL_DIR, tags=[tag_constants.SERVING])\n",
    "    signature_keys = list(saved_model_loaded.signatures.keys())\n",
    "    print(signature_keys)\n",
    "\n",
    "    infer = saved_model_loaded.signatures['serving_default']\n",
    "    print(infer.structured_outputs)\n",
    "\n",
    "    # prepare dataset iterator\n",
    "    dataset = tf.data.TFRecordDataset(validation_files)   \n",
    "    dataset = dataset.map(map_func=preprocess, num_parallel_calls=20)\n",
    "    dataset = dataset.batch(batch_size=BATCH_SIZE, drop_remainder=True) \n",
    "\n",
    "    print('Warming up for 50 batches...')\n",
    "    cnt = 0\n",
    "    for x, y in dataset:\n",
    "        labeling = infer(x)\n",
    "        cnt += 1\n",
    "        if cnt == 50:\n",
    "            break\n",
    "\n",
    "    print('Benchmarking inference engine...')\n",
    "    num_hits = 0\n",
    "    num_predict = 0\n",
    "    start_time = time.time()\n",
    "    for x, y in dataset:\n",
    "        labeling = infer(x)\n",
    "        preds = labeling['classes'].numpy()\n",
    "        num_hits += np.sum(preds == y)\n",
    "        num_predict += preds.shape[0]\n",
    "        \n",
    "    print('Accuracy: %.2f%%'%(100*num_hits/num_predict))\n",
    "    print('Inference speed: %.2f samples/s'%(num_predict/(time.time()-start_time)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "benchmark_saved_model(SAVED_MODEL_DIR, BATCH_SIZE=BATCH_SIZE)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"3\"></a>\n",
    "## 3. Creating TF-TRT FP32 model\n",
    "\n",
    "Next, we convert the native TF FP32 model to TF-TRT FP32, then verify model accuracy and inference speed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "FP32_SAVED_MODEL_DIR = SAVED_MODEL_DIR+\"_TFTRT_FP32/1\"\n",
    "!rm -rf $FP32_SAVED_MODEL_DIR\n",
    "\n",
    "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(\n",
    "    precision_mode=trt.TrtPrecisionMode.FP32)\n",
    "\n",
    "converter = trt.TrtGraphConverterV2(\n",
    "    input_saved_model_dir=SAVED_MODEL_DIR,\n",
    "conversion_params=conversion_params)\n",
    "converter.convert()\n",
    "\n",
    "converter.save(FP32_SAVED_MODEL_DIR)\n",
    "\n",
    "\n",
    "benchmark_saved_model(FP32_SAVED_MODEL_DIR, BATCH_SIZE=BATCH_SIZE)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"4\"></a>\n",
    "## 4. Creating TF-TRT FP16 model\n",
    "\n",
    "Next, we convert the native TF FP32 model to TF-TRT FP16, then verify model accuracy and inference speed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "FP16_SAVED_MODEL_DIR = SAVED_MODEL_DIR+\"_TFTRT_FP16/1\"\n",
    "!rm -rf $FP16_SAVED_MODEL_DIR\n",
    "\n",
    "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(\n",
    "    precision_mode=trt.TrtPrecisionMode.FP16)\n",
    "\n",
    "converter = trt.TrtGraphConverterV2(\n",
    "    input_saved_model_dir=SAVED_MODEL_DIR,\n",
    "conversion_params=conversion_params)\n",
    "converter.convert()\n",
    "\n",
    "converter.save(FP16_SAVED_MODEL_DIR)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "benchmark_saved_model(FP16_SAVED_MODEL_DIR, BATCH_SIZE=BATCH_SIZE)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"5\"></a>\n",
    "## 5. Creating TF-TRT INT8 model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Creating TF-TRT INT8 inference model requires two steps:\n",
    "\n",
    "- Step 1: Prepare a calibration dataset\n",
    "\n",
    "- Step 2: Convert and calibrate the TF-TRT INT8 inference engine\n",
    "\n",
    "### Step 1: Prepare a calibration dataset\n",
    "\n",
    "Creating TF-TRT INT8 model requires a small calibration dataset. This data set ideally should represent the test data in production well, and will be used to create a value histogram for each layer in the neural network for effective 8-bit quantization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_calibration_batches = 2\n",
    "\n",
    "# prepare calibration dataset\n",
    "dataset = tf.data.TFRecordDataset(validation_files)   \n",
    "dataset = dataset.map(map_func=preprocess, num_parallel_calls=20)\n",
    "dataset = dataset.batch(batch_size=BATCH_SIZE, drop_remainder=True) \n",
    "calibration_dataset = dataset.take(num_calibration_batches)\n",
    "\n",
    "def calibration_input_fn():\n",
    "    for x, y in calibration_dataset:\n",
    "        yield (x, )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "### Step 2: Convert and calibrate the TF-TRT INT8 inference engine\n",
    "\n",
    "The calibration step may take a while to complete."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set a directory to write the saved model\n",
    "INT8_SAVED_MODEL_DIR =  SAVED_MODEL_DIR + \"_TFTRT_INT8/1\"\n",
    "!rm -rf $INT8_SAVED_MODEL_DIR\n",
    "\n",
    "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(\n",
    "    precision_mode=trt.TrtPrecisionMode.INT8)\n",
    "\n",
    "converter = trt.TrtGraphConverterV2(\n",
    "    input_saved_model_dir=SAVED_MODEL_DIR,\n",
    "conversion_params=conversion_params)\n",
    "converter.convert(calibration_input_fn=calibration_input_fn)\n",
    "\n",
    "converter.save(INT8_SAVED_MODEL_DIR)\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Benchmarking INT8 saved model\n",
    "\n",
    "Finally we reload and verify the accuracy and performance of the INT8 saved model from disk."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "benchmark_saved_model(INT8_SAVED_MODEL_DIR, BATCH_SIZE=BATCH_SIZE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!saved_model_cli show --all --dir $INT8_SAVED_MODEL_DIR"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"6\"></a>\n",
    "## 6. Calibrating TF-TRT INT8 model with raw JPEG images\n",
    "\n",
    "As an alternative to taking data in TFRecords format, in this section, we demonstrate the process of calibrating TFTRT INT-8 model from a directory of raw JPEG images. We asume that raw images have been mounted to the directory `/data/Calibration_data`.\n",
    "\n",
    "As a rule of thumb, calibration data should be a small but representative set of images that is similar to what is expected in deployment. Empirically, for common network architectures trained on imagenet data, calibration data of size 500-1000 provide good accuracy. As such, a good strategy for a dataset such as imagenet is to choose one sample from each class. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_directory = \"/data/Calibration_data\"\n",
    "calibration_files = [os.path.join(path, name) for path, _, files in os.walk(data_directory) for name in files]\n",
    "print('There are %d calibration files. \\n%s\\n%s\\n...'%(len(calibration_files), calibration_files[0], calibration_files[-1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define a helper function to read and preprocess image from JPEG file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def parse_file(filepath):\n",
    "    image = tf.io.read_file(filepath)\n",
    "    image = tf.image.decode_jpeg(image, channels=3)\n",
    "    image = vgg_preprocessing(image, 224, 224)\n",
    "    return image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "num_calibration_batches = 2\n",
    "\n",
    "# prepare calibration dataset\n",
    "dataset = tf.data.Dataset.from_tensor_slices(calibration_files)\n",
    "dataset = dataset.map(map_func=parse_file, num_parallel_calls=20)\n",
    "dataset = dataset.batch(batch_size=BATCH_SIZE)\n",
    "dataset = dataset.repeat(None)\n",
    "calibration_dataset = dataset.take(num_calibration_batches)\n",
    "\n",
    "def calibration_input_fn():\n",
    "    for x in calibration_dataset:\n",
    "        yield (x, )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we proceed with the two-stage process of creating and calibrating TFTRT INT8 model.\n",
    "\n",
    "### Convert and calibrate the TF-TRT INT8 inference engine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set a directory to write the saved model\n",
    "INT8_SAVED_MODEL_DIR =  SAVED_MODEL_DIR + \"_TFTRT_INT8/2\"\n",
    "!rm -rf $INT8_SAVED_MODEL_DIR\n",
    "\n",
    "conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(\n",
    "    precision_mode=trt.TrtPrecisionMode.INT8)\n",
    "\n",
    "converter = trt.TrtGraphConverterV2(\n",
    "    input_saved_model_dir=SAVED_MODEL_DIR,\n",
    "conversion_params=conversion_params)\n",
    "converter.convert(calibration_input_fn=calibration_input_fn)\n",
    "\n",
    "converter.save(INT8_SAVED_MODEL_DIR)\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As before, we can benchmark the speed and accuracy of the resulting model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "benchmark_saved_model(INT8_SAVED_MODEL_DIR)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "In this notebook, we have demonstrated the process of creating TF-TRT inference model from an original TF FP32 *saved model*. In every case, we have also verified the accuracy and speed to the resulting model. \n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
